Target preparation
So, you want to de novo design a binder for a target protein …
Things to consider
- Do I have an experimental structure of the target?
- Is it high quality and well defined in the regions required?
- If I don’t have an experimental structure, are computational models (e.g. AlphaFold) reliable for this target? Do I believe them based on other data and the biology?
- Is this a good experimental/clinical target?
- What characteristics should designed binders have to be better, cheaper, safer or unique relative to existing tools or therapies?
- In the assay or biological system, will the target surfaces be accessible, and how will my binder get there?
- How will I produce and test my de novo binders?
- Do I have a reliable medium-to-high-throughput assay?
Truncation, trimming, cropping
“Truncating a target is an art.” – Nathaniel Bennett, RFdiffusion README.md
For RFdiffusion, runtime scales as O(N^2), where N is the number of residues.
For BindCraft, 500 residues (target + binder) uses ~30 GB of GPU memory.
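To get a feel for what truncation buys you, here is a minimal back-of-the-envelope sketch. It assumes runtime follows the O(N^2) scaling quoted above exactly, which is a simplification; the function name and the residue counts are hypothetical, chosen only for illustration.

```python
def relative_runtime(n_full: int, n_truncated: int) -> float:
    """Estimated runtime of a truncated system as a fraction of the full one.

    Assumes pure O(N^2) scaling in the number of residues (an idealisation;
    real runtimes also include fixed overheads).
    """
    if n_full <= 0 or n_truncated <= 0:
        raise ValueError("residue counts must be positive")
    return (n_truncated / n_full) ** 2


# Hypothetical example: halving the residue count.
print(relative_runtime(500, 250))  # 0.25
```

Under this assumption, halving the number of residues cuts the estimated runtime to roughly a quarter.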
It is very common, and good practice, to remove parts of the target coordinates to speed up computation of binders and make better use of in-demand GPU resources. Sometimes truncation is the difference between practical (24 GB VRAM), possible (A100-80G or GH200-96G) and not (yet) possible (>141 GB VRAM on a single device).
- Try to keep distinct (sub)domains intact
- Try to avoid exposing the hydrophobic core
- Don’t truncate too close to your proposed binding interface and hotspots
Grab the coordinates for the PD-1/PD-L1 complex 3BIK (legacy PDB format).
We want to use PD-L1 (chain A) as our target and block the binding of PD-1 (chain B).
Propose a truncated version of PD-L1 (chain A) that we could design a de novo binder against.
You can use ChimeraX/PyMOL/whatever if you have it, or just edit the PDB file in a text editor to keep just the ATOM records for the chain and residue ranges you want to keep.
Save the truncated coordinates as PDL1.pdb
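If you would rather script the text-editor approach, the following is a minimal sketch in plain Python. It relies on the fixed-column layout of legacy PDB ATOM records (chain ID in column 22, residue number in columns 23–26). The function names are hypothetical, and the residue ranges are left for you to choose; nothing here suggests what the "right" truncation is.

```python
def truncate_pdb(pdb_lines, chain_id, keep_ranges):
    """Keep ATOM records of one chain whose residue number lies in any
    inclusive (start, end) range in keep_ranges. Everything else
    (HETATM, other chains, headers) is dropped."""
    kept = []
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        if line[21] != chain_id:          # column 22: chain identifier
            continue
        resnum = int(line[22:26])         # columns 23-26: residue number
        if any(start <= resnum <= end for start, end in keep_ranges):
            kept.append(line.rstrip("\n"))
    kept.append("END")
    return kept


def truncate_file(in_path, out_path, chain_id, keep_ranges):
    """Read a legacy-format PDB file and write the truncated copy."""
    with open(in_path) as handle:
        kept = truncate_pdb(handle, chain_id, keep_ranges)
    with open(out_path, "w") as handle:
        handle.write("\n".join(kept) + "\n")
```

For example, `truncate_file("3bik.pdb", "PDL1.pdb", "A", [(start, end)])` would write the chosen range of chain A, where `3bik.pdb` is an assumed local filename and `start`/`end` are the residue numbers you settled on. Always inspect the result in a viewer to check that you have not split a domain or exposed the hydrophobic core.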
Hotspot selection
In the context of de novo binder design, a ‘hotspot’ is a residue on the target that is likely to make favourable interactions with residues on the de novo binder. Hotspots help guide the location and characteristics of the binder-target interface and can have a large (not always predictable) impact on in silico success rates.
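For RFdiffusion, 3–6 hotspots are recommended. During training, RFdiffusion was shown only a random 0–20% of the true interface residues (target residues within 10 Å Cβ distance of the binder) as hotspots, so at inference it expects to make more contacts than the hotspots you specify, and will add any others that appear statistically plausible to the model.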
For BindCraft, zero to X hotspots can be given. Starting with a small number (1–3?) of hotspots is probably best.
Aromatic and hydrophobic residues (F, Y, W, I, L, M) tend to make the best hotspots, but you don’t need to restrict your choices to only these residue types.
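Once you have picked hotspots, they need to be handed to the design tool as chain-plus-residue-number labels. The sketch below formats a list of picks into the command-line override style shown in the RFdiffusion README (`ppi.hotspot_res=[A30,A33,A34]`). The helper name and the residue numbers are hypothetical, and you should check the exact syntax against the version you have installed.

```python
def rfdiffusion_hotspot_arg(hotspots):
    """Format (chain, residue_number) pairs as an RFdiffusion-style
    hotspot override string, e.g. ppi.hotspot_res=[A10,A25].

    The override name follows the RFdiffusion README examples; treat it
    as an assumption if your version differs.
    """
    labels = [f"{chain}{resnum}" for chain, resnum in hotspots]
    if not labels:
        raise ValueError("at least one hotspot is required")
    return "ppi.hotspot_res=[" + ",".join(labels) + "]"


# Hypothetical residue numbers, for illustration only.
print(rfdiffusion_hotspot_arg([("A", 10), ("A", 25), ("A", 31)]))
```

Remember to quote the resulting string on the shell command line, since square brackets are special characters in many shells.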
Look at the residues at the interface of PD-1/PD-L1 in 3BIK.
Propose three residues on PD-L1 (chain A) that we might choose as hotspots when designing a de novo binder to block the interaction with PD-1.
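If you don't have a molecular viewer to hand, a rough list of interface residues can be pulled straight from the PDB file. The sketch below reports target-chain residues with any atom within a distance cutoff of any atom in the partner chain. The 4 Å default cutoff and the function names are assumptions for illustration; the output is a candidate list to inspect, not a ranked set of hotspots.

```python
import math


def parse_atoms(pdb_lines, chain_id):
    """Return (residue_number, residue_name, (x, y, z)) for each ATOM
    record of one chain, using legacy PDB fixed columns."""
    atoms = []
    for line in pdb_lines:
        if line.startswith("ATOM") and len(line) >= 54 and line[21] == chain_id:
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((int(line[22:26]), line[17:20].strip(), xyz))
    return atoms


def interface_residues(pdb_lines, target_chain, partner_chain, cutoff=4.0):
    """Target residues with any atom within `cutoff` angstroms of any
    partner-chain atom. Brute-force all-against-all distance check,
    which is fine for a small complex."""
    pdb_lines = list(pdb_lines)  # allow a file handle to be read twice
    target = parse_atoms(pdb_lines, target_chain)
    partner = parse_atoms(pdb_lines, partner_chain)
    close = set()
    for resnum, resname, xyz in target:
        if (resnum, resname) in close:
            continue  # residue already flagged by another of its atoms
        if any(math.dist(xyz, pxyz) <= cutoff for _, _, pxyz in partner):
            close.add((resnum, resname))
    return sorted(close)
```

Calling `interface_residues(open("3bik.pdb"), "A", "B")` (with `3bik.pdb` as an assumed local filename) would list the PD-L1 residues in contact with PD-1. From there, the aromatic and hydrophobic residues in the list are natural places to start looking for hotspots.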